C4 Phosphoenolpyruvate Carboxylase: Evolution and transcriptional regulation

Abstract Photosynthetic phosphoenolpyruvate carboxylase (PEPC) catalyses the irreversible carboxylation of phosphoenolpyruvate (PEP), producing oxaloacetate (OAA). This enzyme catalyses the first step of carbon fixation in C4 photosynthesis, contributing to the high photosynthetic efficiency of C4 plants. PEPC also replenishes tricarboxylic acid cycle intermediates, such as OAA, and is thereby involved in the C/N balance. In plants, PEPCs are classified into two types: bacterial-type (BTPC) and plant-type (PTPC), the latter including photosynthetic and non-photosynthetic PEPCs. During C4 evolution, photosynthetic PEPCs evolved independently in different lineages. C4 PEPCs evolved to be highly expressed and active in a spatial-specific manner. Their gene expression pattern is also regulated by developmental cues, light, the circadian clock and adverse environmental conditions. However, the gene regulatory networks controlling C4 PEPC gene expression, namely its cell-specificity, are largely unknown. Therefore, after an introduction to the evolution of PEPCs, this review discusses the current knowledge regarding the transcriptional regulation of C4 PEPCs, focusing on cell-specific and developmental expression dynamics, light and circadian regulation, and the response to abiotic stress. Overall, this review highlights the evolution of PEPC, its transcriptional regulation by different signals, its importance in C4 photosynthesis and its potential as a tool for crop improvement.


The path to C4 photosynthesis
To overcome the energy loss due to photorespiration, a process that metabolises a toxic compound generated when Rubisco acts as an oxygenase, some plants have evolved a carbon-concentrating mechanism called C4 photosynthesis. In most C4 plants, CO2 is first fixed in the mesophyll (M) cells by PEPC into a four-carbon acid that is shuttled to the bundle sheath (BS) cells, where it is decarboxylated, thus increasing the CO2 concentration around Rubisco. In addition to the two-cell type C4 photosynthesis, a few plants have developed C4 photosynthesis within a single cell, where the spatial separation of the carbon fixation reactions occurs inside one cell. For instance, in the single-cell C4 species Bienertia sinuspersici, C4 photosynthesis is based on intracellular compartmentation involving two physiologically and biochemically different chloroplast types (Caburatan et al., 2019). C4 photosynthesis has evolved independently more than 60 times, in both dicotyledons and monocotyledons, making it one of the most striking examples of convergent evolution known in nature (Sage et al., 2011). Despite these diverse evolutionary trajectories, all C4 species rely on PEPC for the first carboxylation step (Sage et al., 2011). Many authors have sought to resolve the evolutionary origin of PEPCs and have clearly shown that photosynthetic C4 PEPCs from dicots and monocots evolved from different C3 origins (Westhoff and Gowik, 2004; Christin et al., 2007; Besnard et al., 2009; Christin and Besnard, 2009).
In the dicot genus Flaveria, which contains C3, C4 and C3-C4 intermediate species, three classes of PEPC genes (A, B, and C) can be distinguished (Westhoff and Gowik, 2004). The photosynthetic PEPCs belong to class A (ppcA), which originated from a duplication of class B PEPCs. Class A genes are present in both C3 and C4 species; although their transcript levels vary among species, in C4-like intermediate species ppcA transcript levels are higher and similar to those of C4 plants (Engelmann et al., 2003). Therefore, C4 PEPC isoforms seem to have evolved in a stepwise fashion, with the increase in gene expression preceding amino acid changes (Westhoff and Gowik, 2004; Engelmann et al., 2003).
In the PACMAD clade (named after its subfamilies Panicoideae, Aristidoideae, Chloridoideae, Micrairoideae, Arundinoideae and Danthonioideae), which comprises all the C4 grass species, C4 PEPCs have evolved more than eight times independently, recruiting different C3 PEPC isoforms to acquire the C4 function (Christin et al., 2007; Christin and Besnard, 2009). In most grass species, the recruited isoform was ppc-B2, while in the genus Stipagrostis it was the ppc-A1b isoform (Christin and Besnard, 2009). In sedges (Cyperaceae), the PEPC isoform recruited for C4 photosynthesis is sister to the ppc-A1a and ppc-A1b isoforms from grasses and evolved five times independently (Besnard et al., 2009; Christin and Besnard, 2009). It is yet to be defined which amino acid changes are responsible for the evolution from a C3 to a C4 isoform. Although some amino acid positions have been proposed to be under positive selection for C4 function (Christin et al., 2007), only one amino acid substitution has been conclusively linked to the C4 isoform of PEPC (Bläsing et al., 2000).
The substitution of an alanine by a serine is found in C4 PEPCs of several dicots and monocots, making it a key criterion for defining the C4 isoform. It occurs at position 780 in maize (Christin et al., 2007) and at position 774 in Flaveria species, and significantly influences PEPC kinetic properties (Bläsing et al., 2000). Besides these specific protein features, PEPC transcriptional regulation in C4 plants is tightly controlled, and it is essential for the proper functioning of C4 metabolism.
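As an illustration of how this criterion could be applied, the following minimal sketch checks which residue a PEPC protein sequence carries at the marker position. The positions 780 (maize) and 774 (Flaveria) are taken from the text above; the function names, the labels and the toy sequences are hypothetical, and a real analysis would first require aligning the query to the reference protein so that the numbering is comparable.

```python
# Marker positions (1-based) as reported in the text for the Ala -> Ser substitution.
MARKER_POSITION = {"maize": 780, "flaveria": 774}


def residue_at(protein_seq: str, position: int) -> str:
    """Return the amino acid (one-letter code) at a 1-based position."""
    if not 1 <= position <= len(protein_seq):
        raise ValueError(f"position {position} is outside the sequence")
    return protein_seq[position - 1].upper()


def marker_state(protein_seq: str, reference: str) -> str:
    """Classify a sequence by the residue found at the marker position.

    Assumes the sequence uses the same numbering as the chosen reference.
    """
    residue = residue_at(protein_seq, MARKER_POSITION[reference])
    if residue == "S":
        return "C4-type (Ser)"
    if residue == "A":
        return "C3-type (Ala)"
    return f"other ({residue})"


def toy_sequence(length: int, position: int, residue: str) -> str:
    """Build a placeholder sequence (all Gly) with one residue of interest."""
    seq = ["G"] * length
    seq[position - 1] = residue
    return "".join(seq)


if __name__ == "__main__":
    # Hypothetical sequences, not real PEPC proteins.
    print(marker_state(toy_sequence(970, 780, "S"), "maize"))
    print(marker_state(toy_sequence(966, 774, "A"), "flaveria"))
```

The single-residue check is only a screening criterion; as noted above, other positions have been proposed to be under positive selection, and these are not captured by this sketch.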

Developmental regulation
In monocots and dicots, leaves differentiate along a gradient, in which younger cells are present at the leaf base, while older and more mature cells are present at the leaf tip (Nelson and Langdale, 1989; Stockhaus et al., 1997; Aubry et al., 2014). During leaf development, C4 PEPC gene expression is regulated by developmental cues, increasing gradually from leaf base to leaf tip (Martineau and Taylor, 1985; Stockhaus et al., 1997; Pick et al., 2011; Aubry et al., 2014; Tao et al., 2022). In maize and Cleome gynandra, the C4 PEPC transcript level is higher in mature than in younger leaves (Kausch et al., 2001; Aubry et al., 2014). Since mature leaves have more differentiated M cells than younger leaves, it seems that the C4 PEPC expression level follows M cell differentiation. In fact, maize PEPC was recently identified as a target of COL8, a transcription factor (TF) co-regulated with PEPC during M cell development (Tao et al., 2022). This suggests that COL8 might regulate PEPC expression during leaf development; however, further investigation is required to validate this TF as a PEPC gene regulator. Developmental regulation of C4 PEPC gene expression was also observed in the single-cell type C4 species Bienertia sinuspersici. In this species, gene expression analysis of PEPC isoforms showed that C3 PEPC is more highly expressed in younger leaves or early stages of development, while C4 PEPC is upregulated in the mature stages of leaf development (Caburatan et al., 2019). However, C4 PEPC gene expression does not follow a developmental pattern in all species. In the particular case of amaranth, C4 PEPC is highly expressed from the beginning of leaf development, namely in leaf primordia and in the apical meristem and surrounding regions (Ramsperger et al., 1996).
C4 PEPC protein accumulates at different leaf development stages in a species-dependent manner (Mayfield and Taylor, 1984; Martineau and Taylor, 1985; Dengler et al., 1995; Soros and Dengler, 2001; Voznesenskaya et al., 2003; Wakayama et al., 2003; Majeran et al., 2010; Koteyeva et al., 2014) and, in general, C4 PEPC accumulation accompanies M cell differentiation (Voznesenskaya et al., 2003; Wakayama et al., 2003; Majeran et al., 2010; Koteyeva et al., 2014). Nevertheless, the mechanisms coordinating C4 PEPC gene expression and protein accumulation during leaf development differ among species (Langdale et al., 1988; Wang et al., 1992; Wang et al., 1993; Dengler et al., 1995; Ramsperger et al., 1996; Soros and Dengler, 2001; Voznesenskaya et al., 2003; Wakayama et al., 2003; Koteyeva et al., 2014). In amaranth, in early developmental stages, C4 PEPC gene expression does not occur in a cell-specific way; however, the expressed protein is only present in the M cell precursors (Ramsperger et al., 1996). This pattern is also observed in cotyledons and is maintained in later stages of leaf development, namely during leaf unfolding (Wang et al., 1992; Wang et al., 1993). Although no information is available regarding the regulatory mechanisms underlying C4 PEPC gene expression in amaranth during leaf development, post-transcriptional or translational regulation mechanisms seem to play the main role in regulating cell-specific C4 PEPC protein accumulation (Wang et al., 1992; Wang et al., 1993; Ramsperger et al., 1996). In contrast, maize C4 PEPC is expressed in a cell-specific way throughout leaf development (Langdale et al., 1988; Majeran et al., 2010). Hence, transcriptional regulatory mechanisms seem to be the most important for establishing a cell-specific C4 PEPC expression pattern in maize. Other species known to accumulate C4 PEPC only in M cells, regardless of developmental stage, are Atriplex rosea, Arundinella hirta and two Cleome species (Dengler et al., 1995; Wakayama et al., 2003; Koteyeva et al., 2014); however, the regulatory mechanisms underlying this feature are not known. A different example is Salsola richteri, in which C4 PEPC protein starts to accumulate in a non-cell-specific way at early stages, being present in BS and M cells and, at lower levels, in other leaf cells, whereas in later stages of leaf development C4 PEPC is detected exclusively in M cells (Voznesenskaya et al., 2003). The mechanisms regulating the cell-specific accumulation of S. richteri C4 PEPC are also unknown. As in Salsola richteri, in two Cyperaceae species, Pycreus polystachyos and Eleocharis retroflexa, C4 PEPC accumulation only becomes cell-specific later in leaf development (Soros and Dengler, 2001). In Eleocharis retroflexa, C4 PEPC also accumulates in the parenchymatous BS (PBS), suggesting that PBS and M cells have similar functions (Soros and Dengler, 2001). In the particular case of Rhynchospora rubra, another Cyperaceae species, C4 PEPC never accumulates in a cell-specific way throughout leaf development, suggesting that Rhynchospora rubra may have a different version of C4 photosynthesis (Soros and Dengler, 2001). Although these three species belong to the same family, the differences in C4 PEPC accumulation may be related to the different C4 origins they represent and to differences in anatomical features between species (Soros and Dengler, 2001).
The fact that C4 PEPC gene expression and protein accumulation patterns during leaf development differ among species shows that different species acquired different developmental regulatory mechanisms during C4 evolution, which is not surprising given the convergent evolution of C4 photosynthesis. To better understand these regulatory mechanisms, more information regarding C4 PEPC transcriptional regulation in different species during leaf development is needed.

Spatial regulation
In most C4 plants, photosynthetic reactions are divided between two different cell types, M and BS cells. As stated above, C4 PEPC first fixes CO2 in M cells, where it is highly and specifically expressed (Sage, 2004). This expression pattern required the development of a complex regulatory network during C4 evolution. It has been suggested that the transcriptional mechanisms regulating non-photosynthetic PEPC gene expression were modified to reach a high and cell-specific transcript level (Williams et al., 2012). The recruitment of cis-elements and TFs regulating C3 genes was essential to achieve this (Williams et al., 2012).
In maize, the C4 PEPC promoter (C4 ZmPEPC promoter) drives leaf-specific expression. Despite some gene expression in some leaf-like organs, the C4 ZmPEPC promoter shows very high activity in leaves as compared with other mature tissues, such as roots and stems, in which no activity is detected (Kausch et al., 2001). Dof1 and Dof2 are two TFs identified as putative regulators of C4 PEPC organ-specific gene expression in maize (Yanagisawa and Sheen, 1998) (Figure 3). Dof1 is a ubiquitously expressed TF that works as a light-dependent activator, while Dof2 is only expressed in roots and stems and acts as a repressor (Yanagisawa and Sheen, 1998). In vivo experiments demonstrated that, when Dof2 is expressed, it binds to the C4 PEPC promoter, impairing Dof1 binding and consequently promoter activation (Yanagisawa and Sheen, 1998). Therefore, it was hypothesised that, in stems and roots, Dof2 binds to the C4 PEPC promoter, blocking the Dof1-DNA interaction and, consequently, down-regulating C4 PEPC transcript levels in these tissues (Figure 3A). In leaves, Dof1 is free to bind to the C4 PEPC promoter, thus activating it (Figures 3B and 3C) (Yanagisawa and Sheen, 1998). However, in contrast with this hypothesis, the knockout of Dof1 does not affect C4 PEPC expression levels, suggesting that this TF does not have a prominent role in C4 PEPC transcriptional regulation (Cavalar et al., 2007). Alternatively, other Dof TFs, or even TFs from other families, may act redundantly, in which case the knockout of Dof1 alone may not be sufficient to affect C4 PEPC expression levels. Hence, the identification of other TFs regulating C4 PEPC gene expression will be useful to understand how TFs regulate C4 PEPC expression in a tissue-specific way.
Recently, three additional maize TFs, ZmbHLH80, ZmbHLH90, and ZmOrphan94, have been identified as putative regulators of C4 PEPC cell-specific gene expression, having binding sites in the promoter regions known to be crucial for establishing this expression pattern (Górska et al., 2019; Gupta et al., 2020; Górska et al., 2021) (Figures 3A and 3B). ZmbHLH90 was shown to act as an activator of C4 ZmPEPC, while ZmbHLH80 and ZmOrphan94 act as repressors (Górska et al., 2019; Górska et al., 2021). It was proposed that both repressors, ZmbHLH80 and ZmOrphan94, play an important role in C4 PEPC cell-specific gene expression by keeping its expression low in the BS cells, where they are preferentially expressed. The high ZmbHLH80 and ZmOrphan94 gene expression in the BS cells may lead to the formation of heterodimers with the activator ZmbHLH90, thus impairing its function (Górska et al., 2019; Górska et al., 2021) (Figure 3B). In M cells, ZmbHLH80 and ZmOrphan94 are less expressed and, therefore, ZmbHLH90 is free to form homodimers and thus activate C4 ZmPEPC expression (Górska et al., 2019; Górska et al., 2021). We must, however, emphasise that, although it was clearly shown that ZmbHLH80 and ZmOrphan94 transcript levels are higher in BS than in M cells, nothing is known about their protein abundance. In addition to negative regulation by heterodimerization, other regulatory mechanisms between activators and repressors may operate, such as competition for the same binding site, interaction after DNA binding or a stronger regulatory effect of repressors over activators (Górska et al., 2021) (Figure 3). It would be interesting to investigate whether these newly identified TFs interact with the previously identified TFs and, if so, how they function together to regulate C4 PEPC gene expression. One could also hypothesise that a Dof1/ZmbHLH90 double mutant might be needed to affect C4 ZmPEPC gene expression.
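The proposed heterodimerization mechanism can be made concrete with a deliberately simple toy model, sketched below. It is not taken from the cited studies: the 1:1 sequestration of activator by repressor, the use of the squared free-activator level as a proxy for homodimer-driven promoter activity, and all numerical levels are illustrative assumptions, and the function names are hypothetical. As emphasised above, protein abundances of these TFs have not been measured.

```python
def free_activator(activator: float, repressors: float) -> float:
    """Activator left after 1:1 sequestration into inactive heterodimers.

    Assumption: each repressor molecule (e.g. ZmbHLH80 or ZmOrphan94)
    titrates one activator molecule (ZmbHLH90).
    """
    return max(activator - repressors, 0.0)


def promoter_activity(activator: float, repressors: float) -> float:
    """Relative promoter activity driven by activator homodimers.

    Assumption: homodimer abundance scales with the square of the
    free-activator level (arbitrary units).
    """
    free = free_activator(activator, repressors)
    return free ** 2


if __name__ == "__main__":
    # Arbitrary illustrative levels: same activator level in both cell
    # types, low repressor level in M cells, high in BS cells.
    m_cell = promoter_activity(activator=1.0, repressors=0.2)
    bs_cell = promoter_activity(activator=1.0, repressors=0.9)
    print(f"M cell activity:  {m_cell:.2f}")
    print(f"BS cell activity: {bs_cell:.2f}")
```

Under these assumptions, a modest difference in repressor level between cell types is enough to produce a large difference in promoter activity, which is the qualitative behaviour the hypothesis requires. The alternative mechanisms mentioned above, such as competition for the same binding site, would need a different formulation.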
In addition to TFs, cis-elements in the C4 PEPC promoter have also been associated with mesophyll cell-specific gene expression (Gowik et al., 2004; Akyildiz et al., 2007; Gupta et al., 2020). Interestingly, it has been reported that the C4 PEPC promoter regions underpinning cell-specific expression differ between dicots and monocots (Gowik et al., 2004; Akyildiz et al., 2007; Engelmann et al., 2008; Gupta et al., 2020). In dicots, such as Flaveria species, a region of the distal promoter (2141 to 1566 bp upstream of the ATG) of C4 PEPC is responsible for establishing the spatial expression pattern, while the proximal promoter region (570 bp upstream of the ATG) works as an enhancer of C4 PEPC expression, both being necessary for high and cell-specific expression levels (Gowik et al., 2004; Akyildiz et al., 2007; Engelmann et al., 2008). When the C4 PEPC proximal promoter region was tested in isolation, no cell-specificity was observed. On the other hand, when the proximal promoter region was replaced by its C3 counterpart, cell-specificity was maintained but promoter strength decreased (Gowik et al., 2004; Akyildiz et al., 2007; Engelmann et al., 2008). Although some cis-elements have been identified as putative enhancers within the proximal promoter, their role in C4 PEPC expression has never been proven (Engelmann et al., 2008). Deletions in the distal promoter, however, showed that a cis-element designated mesophyll expression module 1 (MEM1) is essential for cell-specific expression. Without this element, or when it is replaced by its C3 counterpart, M cell specificity is lost (Gowik et al., 2004; Akyildiz et al., 2007). In contrast to Flaveria species, the C4 PEPC proximal promoter (~500 bp) from grasses (monocots) is sufficient to drive high M cell-specific expression, and thus contains all the cis-elements necessary to achieve cell-specificity (Schaffner and Sheen, 1992; Taniguchi et al., 2000; Gupta et al., 2020). Within this region, four conserved nucleotide sequences (CNSs) were identified as essential cis-elements for M cell-specific expression (Gupta et al., 2020). When the CNSs were eliminated from the C4 PEPC promoter, promoter activity was almost abolished, and it was rescued when the original CNSs were replaced by equivalent sequences from a different C4 grass species (Gupta et al., 2020).
In addition to cis and trans factors, some epigenetic modifications might be involved in the regulation of C4 PEPC gene expression. Tri-methylation (H3K4me3) and di-methylation (H3K4me2) of histone H3 lysine 4, found in the C4 PEPC proximal promoter and transcribed regions, seem to be associated with the establishment of C4 PEPC cell-specific expression (Danker et al., 2008; Heimann et al., 2013). These epigenetic modifications seem to have antagonistic effects, as an enrichment of H3K4me3 in M cells and of H3K4me2 in BS cells is observed in several grass species (Danker et al., 2008; Heimann et al., 2013). Based on this evidence, it was proposed that a methyltransferase is recruited in a cell-specific way to convert low histone methylation states, such as H3K4me2, established by default in C4 PEPC, into H3K4me3, enabling promoter activation (Danker et al., 2008).
A few studies have identified unmethylated CpG islands in the C4 PEPC promoter (Langdale et al., 1991; Tolley et al., 2012). These regions, along with H3K4me3, may maintain an open chromatin state. Despite these CpG islands being unmethylated in both M and BS cells, a similar hypothesis regarding the recruitment of a methyltransferase has been proposed (Tolley et al., 2012). In this way, an open chromatin conformation is maintained, and transcription can be induced in M cells (Tolley et al., 2012). Nevertheless, the identification and functional characterization of such methyltransferase(s) or demethylase(s) is still to be carried out.
Although progress has been made in recent years towards a better understanding of the gene regulatory mechanisms underlying C4 PEPC cell-specific gene expression, much remains to be unveiled. More progress has been made in the characterization of important cis-elements than in the identification and characterization of key trans-factors regulating C4 PEPC cell-specificity. Although some TFs have been identified as binding to the C4 PEPC promoter and as putative regulators of C4 PEPC cell-specific gene expression, the key players are still missing: the TF or TFs that promote or impair C4 PEPC cell-specific gene expression remain to be identified. Therefore, we believe that more effort is necessary to identify new TFs regulating C4 PEPC gene expression and to understand the signalling pathways and regulatory networks involved.

Diel regulation
The circadian clock is an internal mechanism that regulates several biological processes, including C4 photosynthesis (Khan et al., 2010). Although the effects of the circadian clock on C4 PEPC gene expression remain largely unknown, a few studies have shown that, similarly to other C4 genes, C4 PEPC gene expression is under circadian regulation (Horst et al., 2009; Khan et al., 2010). C4 PEPC is an early-morning-phased gene and, despite its light regulation, it presents an oscillatory rhythm under constant light (Horst et al., 2009; Khan et al., 2010; Xu et al., 2016). In the maize C4 PEPC distal promoter region (1300 bp upstream of the ATG), some histone acetylation marks, such as H3K9ac, which is highly correlated with transcriptional activation, show circadian oscillation, maintaining their rhythmicity and high amplitude under constant light (Horst et al., 2009). These observations show that, although regulators of C4 PEPC cell-specific gene expression are located within the first 500 bp upstream of the translational start codon (Gupta et al., 2020), the distal promoter region (1300 bp upstream of the ATG) might be more related to the C4 PEPC gene expression level, as well as to its circadian regulation.
It was shown that, during the night period of a diel cycle, histone acetylation is not totally removed (Offermann et al., 2006). The intermediate histone acetylation levels found during this period contrast with the low acetylation levels found in this gene after a long period of dark exposure (Offermann et al., 2006). Therefore, it was proposed that light regulates histone acetyltransferases (HATs), which are also active under dark conditions to maintain steady-state acetylation levels (Offermann et al., 2006). One can thus hypothesise that HAT activity or expression levels may also be regulated by the circadian clock. Nevertheless, it was shown that high histone acetylation of the C4 PEPC promoter may not be enough to induce transcription. In maize, the treatment of darkened leaves with a histone deacetylase (HDAC) inhibitor did not alter C4 PEPC gene expression (Offermann et al., 2006).
As described above, ZmbHLH80 and ZmbHLH90 participate in C4 PEPC regulation (Górska et al., 2019). Interestingly, in Arabidopsis thaliana, FBH1, a TF homologous to ZmbHLH80 and ZmbHLH90, is involved in circadian rhythm regulation by repressing CCA1 gene expression (Nagel et al., 2014). FBH1 is also involved in CCA1 regulation in response to warm temperatures (Nagel et al., 2014). It would be interesting to understand whether this mechanism is conserved in maize and other C4 species, and to unveil the regulators involved. This will help us to better understand how C4 PEPC and, eventually, other C4 genes are regulated by the circadian rhythm.
Although the molecular mechanisms underlying C4 PEPC light regulation are still unclear, this gene is known to be light-regulated at different levels. In the C4 PEPC distal promoter (between 3178 and 2908 bp upstream of the ATG), four cytosine residues were identified as differentially methylated in plants grown under different light conditions (Langdale et al., 1991; Tolley et al., 2012). These residues are less methylated in M cells of green leaves than in etiolated leaves or roots (Langdale et al., 1991; Tolley et al., 2012). In greening leaves, an increase in demethylation of two of these cytosine residues was also observed within 48 h of light exposure (Langdale et al., 1991). However, although the demethylation of these residues correlates well with the increase in C4 ZmPEPC transcript levels, it does not seem to be important for the cell-specific transcription of this gene, since its proximal promoter region is sufficient to drive M cell-specific expression (Tolley et al., 2012; Gupta et al., 2020). Nevertheless, it is possible that upstream differentially methylated regions act as enhancers of C4 ZmPEPC expression in M cells, although their contribution to C4 PEPC expression is still unclear (Tolley et al., 2012).
In greening maize leaves, the chromatin of the proximal promoter region (500 bp upstream of the ATG) is in an open state, compared with the chromatin of the same region in etiolated leaves, showing that light modulates the chromatin dynamics of this region of the C4 PEPC promoter (Kalamajka et al., 2003). In species from different C4 evolutionary origins, some histone acetylation sites in both coding and promoter regions of C4 PEPC are regulated by light (Table 1) (Offermann et al., 2006; Offermann et al., 2008; Horst et al., 2009; Heimann et al., 2013). A comparison between the distal and proximal C4 ZmPEPC promoter regions revealed that acetylation levels have a stronger light response and a higher correlation with transcription in the distal promoter regions (Horst et al., 2009). This further supports the idea that the distal promoter of C4 PEPC may act as an enhancer of C4 PEPC gene expression.
To control C4 PEPC acetylation levels, light modulates the activity of histone deacetylases (HDACs) (Offermann et al., 2006; Offermann et al., 2008). During the night period, some HDACs are activated to deacetylate the C4 PEPC promoter. During the day, although some HDACs are repressed, others are activated to maintain steady-state histone acetylation levels (Offermann et al., 2006; Offermann et al., 2008). HDACs therefore seem to be important for regulating the acetylation levels of C4 PEPC; however, the HDACs involved in this regulation remain to be identified. It has long been known that light has an important role in modulating the binding of proteins to the C4 PEPC promoter (Kano-Murakami et al., 1991). In vitro experiments showed that nuclear factors extracted from green maize leaves are able to bind to the C4 ZmPEPC promoter, whilst nuclear factors extracted from etiolated maize leaves are not (Kano-Murakami et al., 1991). A good example of a TF binding to the C4 PEPC promoter in a light-dependent manner is Dof1, whose activity is modulated by light (Yanagisawa and Sheen, 1998). Dof1 can induce higher C4 PEPC promoter activity in greening than in etiolated protoplasts (Yanagisawa and Sheen, 1998). Since both blue and red light induce the expression of C4 PEPC, it seems that both the phytochrome and the cryptochrome pathways contribute to the regulation of the C4 PEPC gene. However, the downstream players of this regulation remain to be unveiled (Hendron and Kelly, 2020). Since light is an important stimulus regulating C4 PEPC expression, it would be interesting to identify and characterize more TFs that regulate C4 PEPC in response to light and to unveil the regulatory mechanisms of the different photoreceptors.
Besides light itself playing a crucial role in regulating C4 PEPC gene expression, signals originating from the interplay between light and chloroplast development seem to be relevant for C4 PEPC regulation (Kausch et al., 2001; Burgess et al., 2016). The inhibition of chloroplast development reduces the activation of the C4 ZmPEPC promoter, and an increase in C4 ZmPEPC expression was observed in greening maize seedlings (Kausch et al., 2001; Burgess et al., 2016). Although one can hypothesise that chloroplast development is a relevant component of the regulation of C4 PEPC gene expression, the regulatory mechanisms are still unknown.
Although light is a crucial environmental cue regulating C4 PEPC gene expression, the regulatory mechanisms underlying the light response need further investigation. It would be interesting to unveil the regulatory mechanisms involved in the epigenetic modifications of the C4 PEPC promoter in response to light and to understand their relevance for C4 photosynthesis. The identification of TFs, cis-elements and downstream players of the different photoreceptor pathways involved in the regulation of C4 PEPC is also important for understanding the light regulatory networks. Finally, retrograde signalling is a rather unexplored topic regarding C4 PEPC expression. Since it seems to be a relevant component of C4 PEPC regulation, it would be important to understand the regulatory mechanisms involved in this process and the interplay between light and retrograde signalling.

Response of C4 PEPC to adverse environmental conditions
Plants are sessile organisms that cannot escape from adverse environmental conditions. To cope with such conditions, plants need to rearrange their metabolism. Photosynthesis is a key process for life on Earth, being essential for many different ecosystems. Alterations in this metabolic pathway can lead to serious decreases in plant yield, which is detrimental to our current agricultural systems. It is therefore of utmost importance to understand how adverse environmental conditions modulate photosynthetic metabolism. Given the importance of C4 photosynthesis, it is particularly important to understand how this metabolism is affected by different environmental stresses. One of the key enzymes in C4 photosynthesis is C4 PEPC, but the mechanisms by which this protein is regulated under stress conditions remain unclear. Here we summarise the current knowledge regarding the effects of various stress conditions on C4 PEPC gene expression. Table 2 summarises the reported effects of different abiotic stresses on C4 PEPC levels.

Osmotic stress
Different adverse environmental conditions alter the osmotic balance within the cell, leading to osmotic stress.These conditions include for instance water deficit, salt stress (osmotic component), or osmolyte pressure (e.g.PEGmediated drought).Although some studies have investigated the impact of osmotic stress in C 4 plants it is still not clear its effect on the C 4 cycle, with many authors claiming that the CBB cycle is the major limiting step in osmotic stress tolerance in C 4 plants.
Several reports have shown a decrease in C 4 PEPC expression and activity in response to water deficit (Pelleschi et al., 1997;Foyer et al., 1998) but other authors have seen an increase of its activity under water deficit (Ghannoum, 2009).An increase in PEPC levels would raise the initial carboxylation of atmospheric CO 2 and increase the carbon flux to BS.If not accompanied by an increase of Rubiscomediated carboxylation, this increase would lead to decreased net carbon fixation, and subsequent CO 2 leakage.Major effect of osmotic stress is the decrease of photosynthetic rate in both C 3 and C 4 plants.It has been proposed that, in C 4 plants, an increase of non-used CO 2 in the BS cells (i.e.↑[CO 2 ] BS ) leads to CO 2 leakage and subsequent decrease in net photosynthesis (Ghannoum, 2009), which could be linked with the changes in PEPC levels described in some works.Jeanneau et al., 2002 tested the effect of overexpression of Sorghum bicolor C 4 PEPC in drought tolerance in maize.They observed an increase in carbon assimilation rates in lines with increased C 4 PEPC expression and a decrease in the lines with decreased C 4 PEPC expression, as it was expected.In terms of drought tolerance, no effect of the overexpression of C 4 PEPC in severe drought conditions was observed, but plants showed a higher water use efficiency in mild-drought conditions.Together, C 4 PEPC plays a role in regulating the carbon flux from M to BS cells, the increase of this flow may be beneficial in the early stages of drought but under more severe water deficit it becomes irrelevant.Overexpression of C 4 PEPC alone seems to lead to an increase in transported CO 2 that may not be efficiently used by Rubisco, either by Rubisco limitation or decarboxylation inefficiency, possibly due to a lack of increase in decarboxylation enzymes (e.g.NADP-ME).
Under salt stress, C4 plants showed higher PEPC activity, contrary to C3 plants (Hatzig et al., 2010). There is no evidence that this increase is linked to an upregulation of photosynthesis; rather, it appears to be related to the anaplerotic role of PEPC. It would be interesting to understand which component of salt stress (osmotic or ionic) is indeed responsible for the upregulation of PEPC and which PEPCs are regulated at the transcriptional level.
Work on Sorghum bicolor analysed the genome-wide transcriptional response to salt, PEG and ABA treatments in both shoots and roots (Buchanan et al., 2005). In terms of C4 PEPC transcripts, an upregulation was observed upon salt stress in both roots and shoots, which is in agreement with previous work in maize (Hatzig et al., 2010). PEG-induced osmotic stress led to downregulation in roots but no changes in shoots, which is contrary to previous results in maize, where either upregulation (Ghannoum, 2009) or downregulation (Pelleschi et al., 1997; Foyer et al., 1998) of C4 PEPC was observed. Treatment with abscisic acid, a key hormone in stress response, led to no change in PEPC transcript levels.
Most genome-wide studies in maize show no significant transcriptional response of C4 ZmPEPC to either biotic or abiotic stresses [data obtained via Genevestigator (https://genevestigator.com/)].

Temperature stress
High and low temperatures affect photosynthesis in both C3 and C4 plants. C4 plants are considered to be more sensitive to cold stress than C3 plants, due to the cold-labile nature of some C4 enzymes (Long, 1983). Plants that are more tolerant to low temperature usually show a higher accumulation of photosynthesis-related enzymes, like Rubisco (Yamori et al., 2014). It was therefore expected that C4 plants under cold stress would accumulate C4-related enzymes to counterbalance their reduced activity. Contrary to this expectation, C4 plants seem to show a decrease in PEPC activity under cold (Selinioti et al., 1985; Angelopoulos and Gavalas, 1988; Chinthapalli et al., 2003). It would be important to understand the transcriptional regulation involved and how knock-out or overexpression of C4 PEPC would affect temperature tolerance.
Although cold decreases C4 PEPC activity, this effect is reversible when plants are placed back under optimal conditions. Although changes in activity are often related to the phosphorylation of C4 PEPC, Chinthapalli et al. (2003) showed that there are no changes in the phosphorylation status of C4 PEPC under different temperature conditions, thus refuting the hypothesis of regulation by phosphorylation. The same study showed that C4 PEPC has increased activity at higher temperatures, in a way that is remarkably different from its C3 counterpart. On the other hand, Crafts-Brandner and Salvucci (2002) showed that C4 PEPC activity is rather insensitive to increases in temperature, although photosynthesis was reduced at temperatures higher than 40 °C. It would be important to investigate how different temperature conditions regulate C4 PEPC gene expression and how this correlates with photosynthetic efficiency.

Nitrogen levels regulation
Nitrogen deficiency is well known to cause a downregulation of C4 PEPC transcript and protein levels in maize leaves (Sugiharto et al., 1990; Schlüter et al., 2012). On the other hand, upon nitrogen treatment, regardless of the form supplied (nitrate or ammonium), C4 PEPC transcript level and activity are significantly upregulated in maize (Sugiharto and Sugiyama, 1992; Suzuki et al., 1994). This upregulation is thought to be mediated by glutamic acid, as its addition leads to an upregulation of C4 PEPC gene expression and the inhibition of its synthesis leads to a downregulation (Sugiharto et al., 1992). Nevertheless, the addition of ammonium does not affect C4 PEPC gene expression in sorghum (Arias-Baldrich et al., 2017), indicating that regulation of C4 PEPC gene expression by nitrate or ammonium treatment may differ even among closely related C4 species. The fact that C4 PEPC gene expression can be modulated by nitrogen levels shows an intrinsic interplay between carbon and nitrogen metabolism, which may have been co-opted during C4 evolution.

Other stresses
It has been reported that cadmium affects the growth of maize plants by disturbing the light and carbon reactions of photosynthesis. High cadmium levels lead to a downregulation of C4 PEPC activity in maize, with the dosage affecting the time needed to see the effects (Wang et al., 2009). Whether this regulation takes place at the transcriptional level is not known.
Atmospheric conditions, namely an increase in ozone concentration, can also affect photosynthesis. It has been shown that an increase in atmospheric ozone affects maize growth and photosynthetic potential. Although the light-harvesting complex is affected at relatively low increases in ozone, the carbon fixation enzymes, namely PEPC and Rubisco, are only affected at higher concentrations, with a reduction in protein amount and transcript levels (Leitao et al., 2007a, b).

Concluding remarks
During plant evolution, PEPCs evolved from bacterial PEPCs, after an ancestral duplication, when Viridiplantae arose. In C3 plants, PEPC is an important enzyme for plant development since it works as a link between carbon and nitrogen metabolism. Later, during C4 evolution, PEPC was independently recruited several times into the C4 cycle, where it performs the first step of CO2 fixation. However, to obtain the features required for the operation of C4 photosynthesis, it was necessary to modify the mechanisms that regulate its gene expression, as well as its protein accumulation and activity. Therefore, to engineer C4 metabolism, it is crucial to understand the C4 PEPC regulatory network.
The regulation of C4 PEPC is complex, being modulated at several levels. At the epigenetic level, patterns of histone methylation were associated with the establishment of cell specificity. However, the mechanisms that maintain this pattern remain unknown. It would be interesting to investigate whether methyltransferases are recruited to the promoter in a cell-specific way to induce higher levels of histone methylation, thereby contributing to gene activation. If this is true, it would also be important to know which methyltransferases are recruited and the mechanisms underlying this process. Similarly, a deeper understanding of the role of CpG islands in establishing the cell specificity of C4 PEPC gene expression would be an interesting topic to investigate. Histone acetylation has been associated with light and circadian regulation and, although it may not be crucial for C4 PEPC regulation, it may contribute to it. It would be interesting to investigate whether histone acetylation can function as a prerequisite to enable C4 PEPC transcription. In addition, different photoreceptors seem to be involved in C4 PEPC transcriptional regulation, since blue and red light induce C4 PEPC gene expression. In the future, it would be relevant to further characterise the regulation of C4 PEPC by the different photoreceptors, to better understand the C4 PEPC light response.
To establish cell specificity, cis-elements and trans-factors were recruited during C4 evolution. Although some progress has been made in characterising C4 PEPC promoters and identifying putative regulatory cis-elements, there is still a gap regarding the identification and characterisation of new trans-factors. It would be interesting to know which TFs bind to MEM1, a crucial cis-element defining cell specificity in Flaveria species. In monocots, some TFs have been identified as putative regulators of cell specificity. However, their relevance for establishing cell specificity and for the efficiency of C4 photosynthesis still needs to be demonstrated. The identification and characterisation of key TFs that establish C4 PEPC cell specificity in both monocots and dicots would be crucial to better understand these mechanisms. Furthermore, in both dicots and monocots, there are certainly relevant cis-elements in the C4 PEPC gene promoter involved in gene expression that remain to be identified.
The circadian regulation of C4 PEPCs is the least explored regulatory mechanism presented in this review. It is known that the circadian clock regulates C4 ZmPEPC at the transcriptional level and that its expression is regulated by ZmbHLH80 and ZmbHLH90. Since the Arabidopsis homologue of these two TFs, FBH1, regulates the circadian clock through the transcriptional regulation of CCA1, it would be interesting to know whether ZmbHLH80 and ZmbHLH90 are involved in the circadian regulation of ZmPEPC1 and whether the regulation of CCA1 is conserved.
Different species have distinct mechanisms to regulate developmental C4 PEPC gene expression and protein accumulation, which is not surprising given that C4 photosynthesis is a convergent evolutionary event. Despite these differences, in all species, M cell differentiation seems to be important for high C4 PEPC gene expression and protein accumulation. However, the regulatory mechanisms underlying leaf development are still poorly understood. In the future, it would be interesting to identify the internal cues involved in establishing M cell specificity along the developmental gradient.
Photosynthetic metabolism underpins the synthesis of the carbohydrates needed for plant growth and reproduction. Adverse environmental conditions that negatively affect photosynthesis will impair plant growth and yield. It is therefore important to understand how photosynthesis responds to environmental stresses and to find ways to improve such responses. In C4 photosynthesis, C4 PEPC plays an important role in carbon fixation, being responsible for the first carboxylation step in the cycle. Because of this role, C4 PEPC is tightly regulated and responds to environmental stimuli, such as water availability, light, nutritional signals, and atmospheric conditions. The regulation of C4 PEPC is poorly understood, but the effects of different environmental cues have been described. The regulation of C4 PEPC levels in response to stress is important to regulate the carbon flux into the C4 cycle, thus regulating the photosynthetic efficiency of the plant. It is difficult to distinguish between the role of C4 PEPC in the C4 cycle and its role in anaplerotic reactions. Since C4 PEPC is an important enzyme for the C/N balance, its regulation can impact several metabolic pathways, making it a good target for improving plant stress responses.
In conclusion, C4 evolution represents one of the most impressive cases of convergent evolution in nature, having occurred independently over 60 times in very distant species. Nevertheless, their carbon concentration mechanisms always rely on a C4 PEPC, which is tightly regulated by internal and environmental cues. Since the function of C4 PEPC in C4 photosynthesis, combined with its anaplerotic role, makes it an important modulator of plant growth and yield, it is of utmost importance to better understand the gene regulatory network (including its evolution) modulating its expression and function.

Figure 1 - Simplified schematic representation of the role played by non-photosynthetic PEPC in the carbon-nitrogen balance. The carboxylation of phosphoenolpyruvate (PEP) is an important step to replenish carbon skeletons to the TCA cycle, re-routing carbon (glycolysis products) into the TCA cycle. The link between the TCA and GOGAT/GS cycles is important for the carbon-nitrogen balance, making PEPC an important regulator of carbon partitioning.

Figure 2 - Cladogram representing the number of PEPC isoforms present in plant genomes. Species are organised according to their phylogenetic relationships, with representatives of important evolutionary groups. Sequences were obtained from the PLAZA and NCBI databases, using different PEPC protein sequences for BLASTp. Incomplete or unrelated sequences were removed by protein alignment and phylogenetic analysis. Red lines represent C4 species, blue lines represent CAM species, and black lines represent C3 species.

Figure 3 - Schematic representation of the different mechanisms proposed to regulate the transcription of C4 ZmPEPC in an organ- and cell-specific way. (A) Regulation of C4 ZmPEPC gene expression in M cells. The repressors ZmbHLH80 and ZmOrphan94 are less expressed than in BS cells, and therefore ZmbHLH90 strongly activates gene expression. (B) Regulation of C4 ZmPEPC gene expression in BS cells. ZmbHLH80 and ZmOrphan94 are preferentially expressed in BS cells, working as repressors of ZmbHLH90 and leading to a downregulation of C4 ZmPEPC expression. ZmbHLH80 and ZmOrphan94 can impair ZmbHLH90 function through heterodimerization or competitive binding for the same binding site. In addition, ZmOrphan94 may also impair ZmbHLH90 through its binding to CACA motifs close to the ZmbHLH90 binding site. In leaves, Dof1 is activated by light, allowing its binding and consequent activation of C4 ZmPEPC gene expression (A and B). (C) Regulation of C4 ZmPEPC in stems and roots by Dof1 and Dof2. These TFs are both expressed in these tissues; however, while Dof1 binds to the respective cis-elements in the C4 ZmPEPC promoter to activate gene expression, Dof2 binds them to block the Dof1-DNA interaction, thus impairing C4 ZmPEPC expression. The black arrows and the red lines represent activation and repression of gene expression, respectively. The thickness of the green arrow represents the expression levels of C4 PEPC in each cell type. Activation and repression by the different TFs are represented as blue arrows and red lines, respectively. The different sizes of Dof1, ZmbHLH80 and ZmOrphan94 between A and B denote their gene expression levels in each cell type. The yellow rectangles represent the binding sites of Dof1 and Dof2 (Yanagisawa and Sheen, 1998) and the green rectangles represent the ZmOrphan94 binding sites. The binding site of ZmbHLH80 and ZmbHLH90 (E-box) is represented by a white rectangle. Within this E-box, there is a CACA motif, which is represented by a green rectangle, similar to the other binding sites of ZmOrphan94. The orange lines underneath the promoter represent the CNSs identified by Gupta et al. (2020).

Table 1 - Histone modifications found in the C4 PEPC gene promoter and regulated processes.

Table 2 - Summary of the effects of abiotic stress on C4 PEPC levels.